Welcome to the Intro to R Programming Workshop!
This is the solutions notebook. Only check this out if you want to see the solutions to the exercises!
For Part I click here
For Part II click here.
Link to Slides: https://favstats.github.io/ds3_r_intro/
The following includes a list of exercises that you can complete on your own.
Take a look at the table below.
Pick three animals from the Animal Lifespan data we haven’t talked about yet.
Assign the lifespan values to respective objects with appropriate names.
| Animal | Maximum Longevity (in years) |
|---|---|
| Human | 122.5 |
| Domestic dog | 24.0 |
| Domestic cat | 30.0 |
| American alligator | 77.0 |
| Golden hamster | 3.9 |
| King penguin | 26.0 |
| Lion | 27.0 |
| Greenland shark | 392.0 |
| Galapagos tortoise | 177.0 |
| African bush elephant | 65.0 |
| California sea lion | 35.7 |
| Fruit fly | 0.3 |
| House mouse | 4.0 |
| Giraffe | 39.5 |
| Wild boar | 27.0 |
giraffe_lifespan <- 39.5
penguin_lifespan <- 26
elephant_lifespan <- 65
Create three (different) logical tests which compare the maximum longevity between your chosen animal lifespans.
Does the output you get make sense?
giraffe_lifespan == penguin_lifespan
## [1] FALSE
giraffe_lifespan > penguin_lifespan
## [1] TRUE
elephant_lifespan != penguin_lifespan
## [1] TRUE
Create two vectors with the help of c():
theanimals <- c("giraffe", "penguin", "elephant")
lifespans <- c(giraffe_lifespan, penguin_lifespan, elephant_lifespan)
Calculate the mean of your lifespan vector.
mean(lifespans)
## [1] 43.5
5.1 Retrieve the second value of the vector that contains your animal names.
Tip: Square brackets are your friend.
theanimals[2]
## [1] "penguin"
5.2 Using code, find out which animals in your lifespans vector have a maximum longevity of more than 25 years.
Tip: For an elegant solution you need to use both vectors, square brackets and a logical test. If you need help, revisit Indexing with logical tests.
theanimals[lifespans > 25]
## [1] "giraffe" "penguin" "elephant"
Calculate the animal to human conversion ratios for the animals you’ve picked and assign the results to an object.
conversions <- 122.5/lifespans
Calculate the human years for the animals you picked, assuming they are all 5 years old.
conversions*5
## [1] 15.506329 23.557692 9.423077
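As a complement to the solutions above, here is a minimal sketch of the same steps using a single named vector, so that animal names and lifespans cannot drift out of sync. It uses three other animals from the table (domestic dog, domestic cat, lion). The object names `other_lifespans`, `long_lived` and `human_years_at_5` are made up for this illustration.

```r
# Lifespans taken from the table above; the names travel with the values.
other_lifespans <- c(dog = 24, cat = 30, lion = 27)

# Logical test plus square brackets, as in exercise 5.2, but on one named vector.
long_lived <- names(other_lifespans)[other_lifespans > 25]
long_lived

# Conversion ratio (human maximum longevity of 122.5 years) applied to an age of 5.
human_years_at_5 <- 122.5 / other_lifespans * 5
round(human_years_at_5, 2)
```

Because the result keeps the names, each converted value is printed under the animal it belongs to.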
Pick one of the animals you chose and create a function which takes as input animal years and outputs human years. Test the function and validate with results from the seventh exercise.
You can name the function in this style:
[your_animal_name]_to_human_years
Tip: If you need help revisit the section Dog to Human years function
Create the function here:
penguin_to_human_years <- function(animal_years, human_lifespan = 122.5, penguin_lifespan = 26) {
  ratio <- human_lifespan / penguin_lifespan
  human_years <- animal_years * ratio
  return(human_years)
}
Try it out here:
penguin_to_human_years(5)
## [1] 23.55769
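The function above fixes the penguin lifespan as a default argument. One possible generalisation, sketched here under the assumption that the lifespan is passed in explicitly, is a single function that works for any animal. The name `animal_to_human_years` and its argument names are hypothetical and not part of the workshop material.

```r
# Hypothetical helper: converts animal years to human years for any lifespan.
animal_to_human_years <- function(animal_years, animal_lifespan, human_lifespan = 122.5) {
  # Guard against non-positive lifespans, which would give meaningless ratios.
  stopifnot(is.numeric(animal_lifespan), all(animal_lifespan > 0))
  ratio <- human_lifespan / animal_lifespan
  animal_years * ratio
}

# A 5-year-old king penguin (maximum longevity 26 years, from the table).
penguin_result <- animal_to_human_years(5, animal_lifespan = 26)

# Validate against the direct calculation used in the earlier exercise.
all.equal(penguin_result, 122.5 / 26 * 5)
```

Keeping the lifespan as an argument means the same function can be reused for the other animals without writing one function per species.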
The following includes a list of exercises that you can complete on your own.
We are going to use the palmerpenguins dataset for the tasks ahead!
For reference, here is a list of some useful functions.
If you have trouble with any of these functions, try reading the documentation with ?function_name
Remember: the dplyr and tidyr verbs in this list take the data as their first argument (ifelse() and case_when() work on vectors instead).
filter()
mutate()
rename()
select()
summarise(); summarize()
group_by(); ungroup()
arrange()
count(); tally()
distinct()
pull()
ifelse()
case_when() (when ifelse is not enough)
separate()
pivot_wider()
pivot_longer()
Load the tidyverse and janitor packages.
If janitor is not installed yet (R will report that there is no package called 'janitor'), install it first with install.packages("janitor").
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2 v purrr 0.3.4
## v tibble 3.0.0 v dplyr 1.0.1
## v tidyr 1.0.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(janitor)
##
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
Read in the already cleaned palmerpenguins dataset using read_csv.
Assign the resulting data to penguins.
Then take a look at it using glimpse.
What kind of variables can you recognize?
penguins <- read_csv("https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/inst/extdata/penguins.csv")
## Parsed with column specification:
## cols(
## species = col_character(),
## island = col_character(),
## bill_length_mm = col_double(),
## bill_depth_mm = col_double(),
## flipper_length_mm = col_double(),
## body_mass_g = col_double(),
## sex = col_character(),
## year = col_double()
## )
glimpse(penguins)
## Rows: 344
## Columns: 8
## $ species <chr> "Adelie", "Adelie", "Adelie", "Adelie", "Adelie",...
## $ island <chr> "Torgersen", "Torgersen", "Torgersen", "Torgersen...
## $ bill_length_mm <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34....
## $ bill_depth_mm <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18....
## $ flipper_length_mm <dbl> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, ...
## $ body_mass_g <dbl> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 347...
## $ sex <chr> "male", "female", "female", NA, "female", "male",...
## $ year <dbl> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2...
Only keep the variables: species, island and sex.
select(penguins, species, island, sex)
## # A tibble: 344 x 3
## species island sex
## <chr> <chr> <chr>
## 1 Adelie Torgersen male
## 2 Adelie Torgersen female
## 3 Adelie Torgersen female
## 4 Adelie Torgersen <NA>
## 5 Adelie Torgersen female
## 6 Adelie Torgersen male
## 7 Adelie Torgersen female
## 8 Adelie Torgersen male
## 9 Adelie Torgersen <NA>
## 10 Adelie Torgersen <NA>
## # ... with 334 more rows
penguins %>%
select(species, island, sex)
## # A tibble: 344 x 3
## species island sex
## <chr> <chr> <chr>
## 1 Adelie Torgersen male
## 2 Adelie Torgersen female
## 3 Adelie Torgersen female
## 4 Adelie Torgersen <NA>
## 5 Adelie Torgersen female
## 6 Adelie Torgersen male
## 7 Adelie Torgersen female
## 8 Adelie Torgersen male
## 9 Adelie Torgersen <NA>
## 10 Adelie Torgersen <NA>
## # ... with 334 more rows
Only keep variables 2 to 4.
select(penguins, 2:4)
## # A tibble: 344 x 3
## island bill_length_mm bill_depth_mm
## <chr> <dbl> <dbl>
## 1 Torgersen 39.1 18.7
## 2 Torgersen 39.5 17.4
## 3 Torgersen 40.3 18
## 4 Torgersen NA NA
## 5 Torgersen 36.7 19.3
## 6 Torgersen 39.3 20.6
## 7 Torgersen 38.9 17.8
## 8 Torgersen 39.2 19.6
## 9 Torgersen 34.1 18.1
## 10 Torgersen 42 20.2
## # ... with 334 more rows
penguins %>%
select(2:4)
## # A tibble: 344 x 3
## island bill_length_mm bill_depth_mm
## <chr> <dbl> <dbl>
## 1 Torgersen 39.1 18.7
## 2 Torgersen 39.5 17.4
## 3 Torgersen 40.3 18
## 4 Torgersen NA NA
## 5 Torgersen 36.7 19.3
## 6 Torgersen 39.3 20.6
## 7 Torgersen 38.9 17.8
## 8 Torgersen 39.2 19.6
## 9 Torgersen 34.1 18.1
## 10 Torgersen 42 20.2
## # ... with 334 more rows
Remove the column year.
select(penguins, -year)
## # A tibble: 344 x 7
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torge~ 39.1 18.7 181 3750
## 2 Adelie Torge~ 39.5 17.4 186 3800
## 3 Adelie Torge~ 40.3 18 195 3250
## 4 Adelie Torge~ NA NA NA NA
## 5 Adelie Torge~ 36.7 19.3 193 3450
## 6 Adelie Torge~ 39.3 20.6 190 3650
## 7 Adelie Torge~ 38.9 17.8 181 3625
## 8 Adelie Torge~ 39.2 19.6 195 4675
## 9 Adelie Torge~ 34.1 18.1 193 3475
## 10 Adelie Torge~ 42 20.2 190 4250
## # ... with 334 more rows, and 1 more variable: sex <chr>
penguins %>%
select(-year)
## # A tibble: 344 x 7
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torge~ 39.1 18.7 181 3750
## 2 Adelie Torge~ 39.5 17.4 186 3800
## 3 Adelie Torge~ 40.3 18 195 3250
## 4 Adelie Torge~ NA NA NA NA
## 5 Adelie Torge~ 36.7 19.3 193 3450
## 6 Adelie Torge~ 39.3 20.6 190 3650
## 7 Adelie Torge~ 38.9 17.8 181 3625
## 8 Adelie Torge~ 39.2 19.6 195 4675
## 9 Adelie Torge~ 34.1 18.1 193 3475
## 10 Adelie Torge~ 42 20.2 190 4250
## # ... with 334 more rows, and 1 more variable: sex <chr>
Only include columns that contain “mm” in the variable name.
select(penguins, contains("mm"))
## # A tibble: 344 x 3
## bill_length_mm bill_depth_mm flipper_length_mm
## <dbl> <dbl> <dbl>
## 1 39.1 18.7 181
## 2 39.5 17.4 186
## 3 40.3 18 195
## 4 NA NA NA
## 5 36.7 19.3 193
## 6 39.3 20.6 190
## 7 38.9 17.8 181
## 8 39.2 19.6 195
## 9 34.1 18.1 193
## 10 42 20.2 190
## # ... with 334 more rows
penguins %>%
select(contains("mm"))
## # A tibble: 344 x 3
## bill_length_mm bill_depth_mm flipper_length_mm
## <dbl> <dbl> <dbl>
## 1 39.1 18.7 181
## 2 39.5 17.4 186
## 3 40.3 18 195
## 4 NA NA NA
## 5 36.7 19.3 193
## 6 39.3 20.6 190
## 7 38.9 17.8 181
## 8 39.2 19.6 195
## 9 34.1 18.1 193
## 10 42 20.2 190
## # ... with 334 more rows
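Besides contains(), select() accepts other name-matching helpers. The following is a small sketch, not part of the original solutions, that uses ends_with() and starts_with() on a toy tibble built from the first three rows shown by glimpse() above. The object names `penguins_head`, `mm_cols` and `bill_cols` are made up for this illustration.

```r
library(dplyr)

# Toy tibble built from the first three rows shown by glimpse() above.
penguins_head <- tibble(
  species = c("Adelie", "Adelie", "Adelie"),
  island = c("Torgersen", "Torgersen", "Torgersen"),
  bill_length_mm = c(39.1, 39.5, 40.3),
  bill_depth_mm = c(18.7, 17.4, 18.0),
  flipper_length_mm = c(181, 186, 195),
  body_mass_g = c(3750, 3800, 3250)
)

# Columns whose names end in "_mm" (the same three columns as contains("mm") here).
mm_cols <- penguins_head %>% select(ends_with("_mm"))
names(mm_cols)

# Columns whose names start with "bill".
bill_cols <- penguins_head %>% select(starts_with("bill"))
names(bill_cols)
```

ends_with("_mm") is slightly stricter than contains("mm"): it would not pick up a column that merely has "mm" somewhere in the middle of its name.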
Rename island to location.
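rename(penguins, location = island)
## # A tibble: 344 x 8
## species location bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torgers~ 39.1 18.7 181 3750
## 2 Adelie Torgers~ 39.5 17.4 186 3800
## 3 Adelie Torgers~ 40.3 18 195 3250
## 4 Adelie Torgers~ NA NA NA NA
## 5 Adelie Torgers~ 36.7 19.3 193 3450
## 6 Adelie Torgers~ 39.3 20.6 190 3650
## 7 Adelie Torgers~ 38.9 17.8 181 3625
## 8 Adelie Torgers~ 39.2 19.6 195 4675
## 9 Adelie Torgers~ 34.1 18.1 193 3475
## 10 Adelie Torgers~ 42 20.2 190 4250
## # ... with 334 more rows, and 2 more variables: sex <chr>, year <dbl>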
penguins %>%
rename(location = island)
## # A tibble: 344 x 8
## species location bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torgers~ 39.1 18.7 181 3750
## 2 Adelie Torgers~ 39.5 17.4 186 3800
## 3 Adelie Torgers~ 40.3 18 195 3250
## 4 Adelie Torgers~ NA NA NA NA
## 5 Adelie Torgers~ 36.7 19.3 193 3450
## 6 Adelie Torgers~ 39.3 20.6 190 3650
## 7 Adelie Torgers~ 38.9 17.8 181 3625
## 8 Adelie Torgers~ 39.2 19.6 195 4675
## 9 Adelie Torgers~ 34.1 18.1 193 3475
## 10 Adelie Torgers~ 42 20.2 190 4250
## # ... with 334 more rows, and 2 more variables: sex <chr>, year <dbl>
Filter the data so that species only includes Chinstrap.
filter(penguins, species == "Chinstrap")
## # A tibble: 68 x 8
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Chinst~ Dream 46.5 17.9 192 3500
## 2 Chinst~ Dream 50 19.5 196 3900
## 3 Chinst~ Dream 51.3 19.2 193 3650
## 4 Chinst~ Dream 45.4 18.7 188 3525
## 5 Chinst~ Dream 52.7 19.8 197 3725
## 6 Chinst~ Dream 45.2 17.8 198 3950
## 7 Chinst~ Dream 46.1 18.2 178 3250
## 8 Chinst~ Dream 51.3 18.2 197 3750
## 9 Chinst~ Dream 46 18.9 195 4150
## 10 Chinst~ Dream 51.3 19.9 198 3700
## # ... with 58 more rows, and 2 more variables: sex <chr>, year <dbl>
penguins %>%
filter(species == "Chinstrap")
## # A tibble: 68 x 8
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Chinst~ Dream 46.5 17.9 192 3500
## 2 Chinst~ Dream 50 19.5 196 3900
## 3 Chinst~ Dream 51.3 19.2 193 3650
## 4 Chinst~ Dream 45.4 18.7 188 3525
## 5 Chinst~ Dream 52.7 19.8 197 3725
## 6 Chinst~ Dream 45.2 17.8 198 3950
## 7 Chinst~ Dream 46.1 18.2 178 3250
## 8 Chinst~ Dream 51.3 18.2 197 3750
## 9 Chinst~ Dream 46 18.9 195 4150
## 10 Chinst~ Dream 51.3 19.9 198 3700
## # ... with 58 more rows, and 2 more variables: sex <chr>, year <dbl>
Filter the data so that species only includes Chinstrap or Gentoo.
filter(penguins, species %in% c("Chinstrap", "Gentoo"))
## # A tibble: 192 x 8
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Gentoo Biscoe 46.1 13.2 211 4500
## 2 Gentoo Biscoe 50 16.3 230 5700
## 3 Gentoo Biscoe 48.7 14.1 210 4450
## 4 Gentoo Biscoe 50 15.2 218 5700
## 5 Gentoo Biscoe 47.6 14.5 215 5400
## 6 Gentoo Biscoe 46.5 13.5 210 4550
## 7 Gentoo Biscoe 45.4 14.6 211 4800
## 8 Gentoo Biscoe 46.7 15.3 219 5200
## 9 Gentoo Biscoe 43.3 13.4 209 4400
## 10 Gentoo Biscoe 46.8 15.4 215 5150
## # ... with 182 more rows, and 2 more variables: sex <chr>, year <dbl>
penguins %>%
filter(species %in% c("Chinstrap", "Gentoo"))
## # A tibble: 192 x 8
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Gentoo Biscoe 46.1 13.2 211 4500
## 2 Gentoo Biscoe 50 16.3 230 5700
## 3 Gentoo Biscoe 48.7 14.1 210 4450
## 4 Gentoo Biscoe 50 15.2 218 5700
## 5 Gentoo Biscoe 47.6 14.5 215 5400
## 6 Gentoo Biscoe 46.5 13.5 210 4550
## 7 Gentoo Biscoe 45.4 14.6 211 4800
## 8 Gentoo Biscoe 46.7 15.3 219 5200
## 9 Gentoo Biscoe 43.3 13.4 209 4400
## 10 Gentoo Biscoe 46.8 15.4 215 5150
## # ... with 182 more rows, and 2 more variables: sex <chr>, year <dbl>
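The solution uses %in% rather than == because the two behave differently when the right-hand side holds more than one value. A minimal base R sketch of the difference follows; the short `species` vector is a hypothetical example, not the real data.

```r
# Hypothetical short vector of species labels, for illustration only.
species <- c("Gentoo", "Chinstrap", "Adelie", "Gentoo")

# %in% asks, element by element, "is this value one of the listed ones?"
species %in% c("Chinstrap", "Gentoo")

# == recycles the right-hand side and compares position by position,
# so it misses matches that happen to sit in the "wrong" position.
species == c("Chinstrap", "Gentoo")
```

Here %in% flags three of the four elements, while == only flags the last one, and does so without any warning.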
Filter the data so it includes only penguins that are male and of the species Adelie.
filter(penguins, sex == "male" & species == "Adelie")
## # A tibble: 73 x 8
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torge~ 39.1 18.7 181 3750
## 2 Adelie Torge~ 39.3 20.6 190 3650
## 3 Adelie Torge~ 39.2 19.6 195 4675
## 4 Adelie Torge~ 38.6 21.2 191 3800
## 5 Adelie Torge~ 34.6 21.1 198 4400
## 6 Adelie Torge~ 42.5 20.7 197 4500
## 7 Adelie Torge~ 46 21.5 194 4200
## 8 Adelie Biscoe 37.7 18.7 180 3600
## 9 Adelie Biscoe 38.2 18.1 185 3950
## 10 Adelie Biscoe 38.8 17.2 180 3800
## # ... with 63 more rows, and 2 more variables: sex <chr>, year <dbl>
penguins %>%
filter(sex == "male" & species == "Adelie")
## # A tibble: 73 x 8
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torge~ 39.1 18.7 181 3750
## 2 Adelie Torge~ 39.3 20.6 190 3650
## 3 Adelie Torge~ 39.2 19.6 195 4675
## 4 Adelie Torge~ 38.6 21.2 191 3800
## 5 Adelie Torge~ 34.6 21.1 198 4400
## 6 Adelie Torge~ 42.5 20.7 197 4500
## 7 Adelie Torge~ 46 21.5 194 4200
## 8 Adelie Biscoe 37.7 18.7 180 3600
## 9 Adelie Biscoe 38.2 18.1 185 3950
## 10 Adelie Biscoe 38.8 17.2 180 3800
## # ... with 63 more rows, and 2 more variables: sex <chr>, year <dbl>
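Two details of filter() are worth seeing on a small example: conditions separated by commas are combined like &, and rows for which a condition evaluates to NA are dropped. The sketch below uses the species and sex values of the first four rows shown by glimpse() above. The object names are made up for this illustration.

```r
library(dplyr)

# Toy tibble using the first four rows shown by glimpse() above (one has a missing sex).
penguins_head <- tibble(
  species = c("Adelie", "Adelie", "Adelie", "Adelie"),
  sex = c("male", "female", "female", NA)
)

# Conditions separated by commas are combined with "and".
with_comma <- penguins_head %>% filter(sex == "male", species == "Adelie")
with_and <- penguins_head %>% filter(sex == "male" & species == "Adelie")
all.equal(with_comma, with_and)

# Rows where the test evaluates to NA are dropped, so the row with a missing sex
# appears in neither the "male" nor the "not male" subset.
not_male <- penguins_head %>% filter(sex != "male")
nrow(not_male)
```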
Create three new variables that convert bill_length_mm, bill_depth_mm and flipper_length_mm from millimeters to centimeters.
Tip: divide the length value by 10.
mutate(penguins,
bill_length_cm = bill_length_mm/10,
bill_depth_cm = bill_depth_mm/10,
flipper_length_cm = flipper_length_mm/10
)
## # A tibble: 344 x 11
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torge~ 39.1 18.7 181 3750
## 2 Adelie Torge~ 39.5 17.4 186 3800
## 3 Adelie Torge~ 40.3 18 195 3250
## 4 Adelie Torge~ NA NA NA NA
## 5 Adelie Torge~ 36.7 19.3 193 3450
## 6 Adelie Torge~ 39.3 20.6 190 3650
## 7 Adelie Torge~ 38.9 17.8 181 3625
## 8 Adelie Torge~ 39.2 19.6 195 4675
## 9 Adelie Torge~ 34.1 18.1 193 3475
## 10 Adelie Torge~ 42 20.2 190 4250
## # ... with 334 more rows, and 5 more variables: sex <chr>, year <dbl>,
## # bill_length_cm <dbl>, bill_depth_cm <dbl>, flipper_length_cm <dbl>
penguins %>%
mutate(bill_length_cm = bill_length_mm/10,
bill_depth_cm = bill_depth_mm/10,
flipper_length_cm = flipper_length_mm/10)
## # A tibble: 344 x 11
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torge~ 39.1 18.7 181 3750
## 2 Adelie Torge~ 39.5 17.4 186 3800
## 3 Adelie Torge~ 40.3 18 195 3250
## 4 Adelie Torge~ NA NA NA NA
## 5 Adelie Torge~ 36.7 19.3 193 3450
## 6 Adelie Torge~ 39.3 20.6 190 3650
## 7 Adelie Torge~ 38.9 17.8 181 3625
## 8 Adelie Torge~ 39.2 19.6 195 4675
## 9 Adelie Torge~ 34.1 18.1 193 3475
## 10 Adelie Torge~ 42 20.2 190 4250
## # ... with 334 more rows, and 5 more variables: sex <chr>, year <dbl>,
## # bill_length_cm <dbl>, bill_depth_cm <dbl>, flipper_length_cm <dbl>
Create a new variable called bill_depth_cat which has two values:
"high" if bill_depth_mm is 18 or more
"low" if bill_depth_mm is below 18
mutate(penguins, bill_depth_cat = ifelse(bill_depth_mm >= 18, "high", "low"))
## # A tibble: 344 x 9
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torge~ 39.1 18.7 181 3750
## 2 Adelie Torge~ 39.5 17.4 186 3800
## 3 Adelie Torge~ 40.3 18 195 3250
## 4 Adelie Torge~ NA NA NA NA
## 5 Adelie Torge~ 36.7 19.3 193 3450
## 6 Adelie Torge~ 39.3 20.6 190 3650
## 7 Adelie Torge~ 38.9 17.8 181 3625
## 8 Adelie Torge~ 39.2 19.6 195 4675
## 9 Adelie Torge~ 34.1 18.1 193 3475
## 10 Adelie Torge~ 42 20.2 190 4250
## # ... with 334 more rows, and 3 more variables: sex <chr>, year <dbl>,
## # bill_depth_cat <chr>
penguins %>%
mutate(bill_depth_cat = ifelse(bill_depth_mm >= 18, "high", "low"))
## # A tibble: 344 x 9
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torge~ 39.1 18.7 181 3750
## 2 Adelie Torge~ 39.5 17.4 186 3800
## 3 Adelie Torge~ 40.3 18 195 3250
## 4 Adelie Torge~ NA NA NA NA
## 5 Adelie Torge~ 36.7 19.3 193 3450
## 6 Adelie Torge~ 39.3 20.6 190 3650
## 7 Adelie Torge~ 38.9 17.8 181 3625
## 8 Adelie Torge~ 39.2 19.6 195 4675
## 9 Adelie Torge~ 34.1 18.1 193 3475
## 10 Adelie Torge~ 42 20.2 190 4250
## # ... with 334 more rows, and 3 more variables: sex <chr>, year <dbl>,
## # bill_depth_cat <chr>
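Row 4 of the data has a missing bill depth, so it helps to know what ifelse() does with NA. A minimal base R sketch, using the first four bill depths from the glimpse() output above:

```r
# First four bill depths from the glimpse() output above, including the NA.
bill_depth_mm <- c(18.7, 17.4, 18.0, NA)

# The comparison returns NA for the missing value, and ifelse() passes it through.
bill_depth_cat <- ifelse(bill_depth_mm >= 18, "high", "low")
bill_depth_cat

# Count the categories, keeping the missing one visible.
table(bill_depth_cat, useNA = "ifany")
```

The missing value is neither "high" nor "low"; it stays NA in the new variable.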
Create a new variable called species_short.
Adelie should become A
Chinstrap should become C
Gentoo should become G
mutate(penguins,
  species_short = case_when(
    species == "Adelie" ~ "A",
    species == "Chinstrap" ~ "C",
    species == "Gentoo" ~ "G"
  ))
## # A tibble: 344 x 9
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torge~ 39.1 18.7 181 3750
## 2 Adelie Torge~ 39.5 17.4 186 3800
## 3 Adelie Torge~ 40.3 18 195 3250
## 4 Adelie Torge~ NA NA NA NA
## 5 Adelie Torge~ 36.7 19.3 193 3450
## 6 Adelie Torge~ 39.3 20.6 190 3650
## 7 Adelie Torge~ 38.9 17.8 181 3625
## 8 Adelie Torge~ 39.2 19.6 195 4675
## 9 Adelie Torge~ 34.1 18.1 193 3475
## 10 Adelie Torge~ 42 20.2 190 4250
## # ... with 334 more rows, and 3 more variables: sex <chr>, year <dbl>,
## # species_short <chr>
penguins %>%
  mutate(species_short = case_when(
    species == "Adelie" ~ "A",
    species == "Chinstrap" ~ "C",
    species == "Gentoo" ~ "G"
  ))
## # A tibble: 344 x 9
## species island bill_length_mm bill_depth_mm flipper_length_~ body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torge~ 39.1 18.7 181 3750
## 2 Adelie Torge~ 39.5 17.4 186 3800
## 3 Adelie Torge~ 40.3 18 195 3250
## 4 Adelie Torge~ NA NA NA NA
## 5 Adelie Torge~ 36.7 19.3 193 3450
## 6 Adelie Torge~ 39.3 20.6 190 3650
## 7 Adelie Torge~ 38.9 17.8 181 3625
## 8 Adelie Torge~ 39.2 19.6 195 4675
## 9 Adelie Torge~ 34.1 18.1 193 3475
## 10 Adelie Torge~ 42 20.2 190 4250
## # ... with 334 more rows, and 3 more variables: sex <chr>, year <dbl>,
## # species_short <chr>
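With case_when(), any value that matches none of the conditions becomes NA. A common way to make that explicit is a final catch-all condition. The sketch below assumes a plain character vector rather than the full data, and "Emperor" is a hypothetical label that does not occur in the penguins data.

```r
library(dplyr)

# "Emperor" is a hypothetical extra label that does not occur in the penguins data.
species <- c("Adelie", "Chinstrap", "Gentoo", "Emperor")

species_short <- case_when(
  species == "Adelie" ~ "A",
  species == "Chinstrap" ~ "C",
  species == "Gentoo" ~ "G",
  TRUE ~ "other"  # fallback for anything not matched above
)
species_short
```

Conditions are checked in order, so the TRUE line only applies to values that none of the earlier lines caught.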
Calculate the average body_mass_g per island.
grouped_by_island <- group_by(penguins, island)
summarise(grouped_by_island, avg_body_mass_g = mean(body_mass_g, na.rm = TRUE))
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 3 x 2
## island avg_body_mass_g
## <chr> <dbl>
## 1 Biscoe 4716.
## 2 Dream 3713.
## 3 Torgersen 3706.
If you haven’t done so already, try using the %>% operator to do this.
penguins %>%
group_by(island) %>%
summarise(avg_body_mass_g = mean(body_mass_g, na.rm = TRUE))
## `summarise()` ungrouping output (override with `.groups` argument)
## # A tibble: 3 x 2
## island avg_body_mass_g
## <chr> <dbl>
## 1 Biscoe 4716.
## 2 Dream 3713.
## 3 Torgersen 3706.
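The message printed by summarise() mentions a `.groups` argument. As a rough sketch of how it can be used, the example below computes two summaries per island on a toy tibble assembled from rows printed in the outputs above and sets `.groups = "drop"` explicitly. The object names `penguins_sample` and `island_summary` are made up for this illustration.

```r
library(dplyr)

# Toy tibble assembled from rows printed in the outputs above.
penguins_sample <- tibble(
  island = c("Torgersen", "Torgersen", "Torgersen", "Torgersen",
             "Dream", "Dream", "Biscoe", "Biscoe"),
  body_mass_g = c(3750, 3800, 3250, NA, 3500, 3900, 4500, 5700)
)

island_summary <- penguins_sample %>%
  group_by(island) %>%
  summarise(
    n_penguins = n(),                                   # rows per island, NA included
    avg_body_mass_g = mean(body_mass_g, na.rm = TRUE),  # NA ignored in the mean
    .groups = "drop"                                    # return an ungrouped tibble
  )
island_summary
```

Setting `.groups` explicitly silences the message and makes the grouping of the result visible in the code itself.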
Use the pipe operator (%>%) to do all the operations below.
Filter the penguins data so that it only includes Chinstrap or Adelie.
Rename sex to observed_sex.
Only keep the variables species, observed_sex, bill_length_mm and bill_depth_mm.
Calculate the ratio of bill_length_mm and bill_depth_mm.
Sort the data by the highest ratio.
Try to create the pipe step by step and execute code as you go to see if it works.
Once you are done, assign the data to new_penguins.
new_penguins <- penguins %>%
  filter(species %in% c("Chinstrap", "Adelie")) %>%
  rename(observed_sex = sex) %>%
  select(species, observed_sex, bill_length_mm, bill_depth_mm) %>%
  mutate(ratio = bill_length_mm/bill_depth_mm) %>%
  arrange(desc(ratio))
new_penguins
## # A tibble: 220 x 5
## species observed_sex bill_length_mm bill_depth_mm ratio
## <chr> <chr> <dbl> <dbl> <dbl>
## 1 Chinstrap female 58 17.8 3.26
## 2 Chinstrap female 48.1 16.4 2.93
## 3 Chinstrap female 49.8 17.3 2.88
## 4 Chinstrap male 52 18.1 2.87
## 5 Chinstrap female 50.9 17.9 2.84
## 6 Chinstrap female 46.8 16.5 2.84
## 7 Chinstrap female 47.5 16.8 2.83
## 8 Chinstrap female 46.9 16.6 2.83
## 9 Chinstrap male 51.3 18.2 2.82
## 10 Chinstrap male 55.8 19.8 2.82
## # ... with 210 more rows
Calculate the average ratio by species and sex, again using pipes.
penguins %>%
group_by(island, sex) %>%
summarise(avg_body_mass_g = mean(body_mass_g, na.rm = T))
## `summarise()` regrouping output by 'island' (override with `.groups` argument)
## # A tibble: 9 x 3
## # Groups: island [3]
## island sex avg_body_mass_g
## <chr> <chr> <dbl>
## 1 Biscoe female 4319.
## 2 Biscoe male 5105.
## 3 Biscoe <NA> 4588.
## 4 Dream female 3446.
## 5 Dream male 3987.
## 6 Dream <NA> 2975
## 7 Torgersen female 3396.
## 8 Torgersen male 4035.
## 9 Torgersen <NA> 3681.
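Note that the code above averages body_mass_g by island and sex, whereas the task asks for the average ratio by species and sex. A minimal sketch of the ratio version follows, run on a toy tibble assembled from rows printed in the outputs above so that it is self-contained. The object names `penguins_sample` and `avg_ratio_by_group` are made up for this illustration. On the real data, the same group_by() and summarise() steps would presumably be applied to new_penguins, grouping by species and observed_sex.

```r
library(dplyr)

# Toy tibble assembled from rows printed in the outputs above.
penguins_sample <- tibble(
  species = c("Chinstrap", "Chinstrap", "Chinstrap", "Adelie", "Adelie", "Adelie"),
  sex = c("female", "male", "male", "male", "female", "female"),
  bill_length_mm = c(58, 52, 51.3, 39.1, 39.5, 40.3),
  bill_depth_mm = c(17.8, 18.1, 18.2, 18.7, 17.4, 18.0)
)

avg_ratio_by_group <- penguins_sample %>%
  mutate(ratio = bill_length_mm / bill_depth_mm) %>%  # bill length relative to depth
  group_by(species, sex) %>%
  summarise(avg_ratio = mean(ratio, na.rm = TRUE), .groups = "drop")
avg_ratio_by_group
```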
Count the number of penguins by island and species.
penguins %>%
count(island, species)
## # A tibble: 5 x 3
## island species n
## <chr> <chr> <int>
## 1 Biscoe Adelie 44
## 2 Biscoe Gentoo 124
## 3 Dream Adelie 56
## 4 Dream Chinstrap 68
## 5 Torgersen Adelie 52
Below is a dataset that needs some cleaning.
Use the skills that you have learned so far to turn the data into a tidy dataset.
animal_friends <- tibble(
Names = c("Francis", "Catniss", "Theodor", "Eugenia"),
TheAnimals = c("Dog", "Cat", "Hamster", "Rabbit"),
Sex = c("m", "f", "m", "f"),
a_opterr = c("me", "me", "me", "me"),
`Age/Adopted/Condition` = c("8/2020/Very Good", "13/2019/Wild", "1/2021/Fair", "2/2020/Good")
)
Start here:
tidy_animal_friends <- animal_friends %>%
## first clean the names
clean_names() %>%
## rename some variables
rename(adopter = a_opterr,
animals = the_animals) %>%
remove_constant() %>%
separate(age_adopted_condition, into = c("age", "year_adopted", "condition"), sep = "/")
tidy_animal_friends
## # A tibble: 4 x 6
## names animals sex age year_adopted condition
## <chr> <chr> <chr> <chr> <chr> <chr>
## 1 Francis Dog m 8 2020 Very Good
## 2 Catniss Cat f 13 2019 Wild
## 3 Theodor Hamster m 1 2021 Fair
## 4 Eugenia Rabbit f 2 2020 Good
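In the output above, age and year_adopted are still character columns, because separate() splits strings into strings. One way to get numbers instead, sketched here on just the combined column of the exercise data, is the `convert = TRUE` argument of separate(). The object names `animal_ages` and `animal_ages_typed` are made up for this illustration.

```r
library(tidyr)
library(tibble)

# Only the names and the combined column from the exercise data are reproduced here.
animal_ages <- tibble(
  names = c("Francis", "Catniss", "Theodor", "Eugenia"),
  age_adopted_condition = c("8/2020/Very Good", "13/2019/Wild", "1/2021/Fair", "2/2020/Good")
)

# convert = TRUE asks separate() to guess column types after splitting.
animal_ages_typed <- animal_ages %>%
  separate(age_adopted_condition,
           into = c("age", "year_adopted", "condition"),
           sep = "/",
           convert = TRUE)

sapply(animal_ages_typed, class)
```

Converting the columns is only worthwhile while the data stays in wide format: a single long-format value column has to hold one common type.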
If you are done, turn the final data into long format.
tidy_animal_friends %>%
pivot_longer(cols = c(sex, age, year_adopted, condition))
## # A tibble: 16 x 4
## names animals name value
## <chr> <chr> <chr> <chr>
## 1 Francis Dog sex m
## 2 Francis Dog age 8
## 3 Francis Dog year_adopted 2020
## 4 Francis Dog condition Very Good
## 5 Catniss Cat sex f
## 6 Catniss Cat age 13
## 7 Catniss Cat year_adopted 2019
## 8 Catniss Cat condition Wild
## 9 Theodor Hamster sex m
## 10 Theodor Hamster age 1
## 11 Theodor Hamster year_adopted 2021
## 12 Theodor Hamster condition Fair
## 13 Eugenia Rabbit sex f
## 14 Eugenia Rabbit age 2
## 15 Eugenia Rabbit year_adopted 2020
## 16 Eugenia Rabbit condition Good
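The reverse operation is pivot_wider(). As a small sketch, the long-format rows for two of the animals are copied from the output above and spread back into one row per animal. The object names `animal_long` and `animal_wide` are made up for this illustration.

```r
library(tidyr)
library(tibble)

# Long-format rows for two of the animals, copied from the output above.
animal_long <- tibble(
  names = rep(c("Francis", "Catniss"), each = 4),
  animals = rep(c("Dog", "Cat"), each = 4),
  name = rep(c("sex", "age", "year_adopted", "condition"), times = 2),
  value = c("m", "8", "2020", "Very Good", "f", "13", "2019", "Wild")
)

# names_from picks the column holding the new column names,
# values_from the column holding their values.
animal_wide <- animal_long %>%
  pivot_wider(names_from = name, values_from = value)
animal_wide
```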